PCOS Analysis

Agnes Lorenzen, Cecille Hobbs, Freja E. Klippmann, Julie Dalgaard Petersen & Mille Rask Sander

Introduction

Background

  • Polycystic ovary syndrome (PCOS) is a syndrome documented in women of menstruating age

  • Commonly documented symptoms are period pain, irregular periods, ovary-related problems and hormone imbalance

  • Patients with PCOS often have difficulty becoming pregnant and may experience complications during pregnancy

  • However, the cause of PCOS has still not been established.

Aim

The aim of this study is to examine a data set (found on Kaggle) of patients with and without PCOS. The data were collected in India and come from 10 different hospitals.

Data handling approach

  • Raw data:
    541 observations of 45 variables

  • 01_load_data:
    Simply loads the data

  • 02_clean_data:

    • Replacing stray, invalid cell entries with NA
    • Renaming & factorizing columns
    • Splitting the data frame into body and blood measurements
    • Removing an empty column
  • 03_augment:

    • Unit changes (inch to cm)
    • Rounding & grouping BMI
    • Changing blood type and cycle from numeric codes to character labels
    • Creating a new column for cycle/pregnancy stage
    • Merging the data frames into one file
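The cleaning step can be illustrated with a minimal sketch on toy data. The column names (`PCOS`, `AMH`), the stray entry `"a"` and the 0/1 coding are assumptions made for illustration; the actual 02_clean_data script is not shown here and may work differently.

```r
library(dplyr)

# Toy data: "a" stands in for a stray, non-numeric cell entry (assumed example)
raw_example <- tibble(
  PCOS = c(0, 1, 0),
  AMH  = c("2.07", "a", "1.53")
)

clean_example <- raw_example |>
  # Replace the stray entry with NA, then convert the column to numeric
  mutate(AMH = na_if(AMH, "a")) |>
  mutate(AMH = as.numeric(AMH)) |>
  # Recode the assumed 0/1 diagnosis code into a labelled factor
  mutate(PCOS_diagnosis = factor(case_when(PCOS == 1 ~ "Yes",
                                           PCOS == 0 ~ "No")))
```

Replacing the entry before calling `as.numeric()` avoids a coercion warning and makes the NA explicit.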

# Round BMI to one decimal and divide it into categories
library(dplyr)

body_measurements <- body_measurements |>
  mutate(BMI = round(BMI, 1)) |>
  # Each class is a half-open interval: lower bound included, upper bound excluded
  mutate(BMI_class = case_when(
    BMI < 18.5 ~ "Underweight",
    BMI >= 18.5 & BMI < 25 ~ "Normal weight",
    BMI >= 25 & BMI < 30 ~ "Overweight",
    BMI >= 30 ~ "Obesity")) |>
  # Fix the level order so the classes sort from lowest to highest BMI
  mutate(BMI_class = factor(BMI_class,
                            levels = c("Underweight",
                                       "Normal weight",
                                       "Overweight",
                                       "Obesity"))) |>
  relocate(BMI_class, .after = BMI)
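The other augmentation steps (unit conversion and recoding numeric codes to labels) could look roughly like the sketch below. The column names `Waist_inch` and `Cycle`, and the coding 2 = regular, 4 = irregular, are assumptions made for illustration and should be checked against the data set's documentation.

```r
library(dplyr)

# Toy data; column names and numeric codes are illustrative assumptions
body_example <- tibble(
  Waist_inch = c(30, 36),
  Cycle      = c(2, 4)
)

body_example <- body_example |>
  # 1 inch = 2.54 cm
  mutate(Waist_cm = round(Waist_inch * 2.54, 1)) |>
  # Recode the assumed numeric cycle codes into character labels
  mutate(Cycle = case_when(
    Cycle == 2 ~ "Regular",
    Cycle == 4 ~ "Irregular"))
```

Any code not listed in `case_when()` becomes NA, which makes unexpected values easy to spot.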

Descriptive analysis of data

Dimensions:

# A tibble: 2 × 1
  `PCOS dimensions`
              <int>
1               541
2                44

Count of how many have PCOS:

# A tibble: 2 × 2
  PCOS_diagnosis     n
  <chr>          <int>
1 No               364
2 Yes              177
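Summaries like the two above can be produced along these lines. This is a sketch on a three-row toy data frame (`pcos_example` is a hypothetical name), so the numbers it yields are not the study's.

```r
library(dplyr)

# Toy stand-in for the merged data set
pcos_example <- tibble(
  PCOS_diagnosis = c("No", "Yes", "No"),
  BMI            = c(21.3, 27.8, 19.4)
)

# Rows and columns as a one-column tibble
dims <- tibble(`PCOS dimensions` = dim(pcos_example))

# Number of patients per diagnosis
counts <- pcos_example |>
  count(PCOS_diagnosis)
```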

Body measurement - Follicle number

In this analysis, we look at which factors in the body measurements data are associated with a PCOS diagnosis, that is, which factors PCOS-diagnosed patients potentially have in common.
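One simple way to compare a body measurement between the two groups is a grouped summary. The sketch below uses made-up follicle counts and hypothetical column names (`Follicle_no_L`, `Follicle_no_R`); it illustrates the technique only and does not reproduce the study's results.

```r
library(dplyr)

# Made-up values for illustration only
follicle_example <- tibble(
  PCOS_diagnosis = c("No", "No", "Yes", "Yes"),
  Follicle_no_L  = c(4, 6, 10, 14),
  Follicle_no_R  = c(5, 5, 12, 12)
)

# Mean follicle number per ovary, split by diagnosis
follicle_summary <- follicle_example |>
  group_by(PCOS_diagnosis) |>
  summarise(mean_left  = mean(Follicle_no_L),
            mean_right = mean(Follicle_no_R))
```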

PCA of blood measurements

PCOS-diagnosed individuals do not separate from non-PCOS individuals in the PCA.
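A PCA of this kind can be sketched with base R's `prcomp()`. The data below are simulated and the column names (`Hb`, `TSH`, `AMH`) are placeholders; the study's own PCA code is not shown here and may have used a different workflow.

```r
# Simulated blood measurements; values and column names are illustrative
set.seed(1)
blood_example <- data.frame(
  PCOS_diagnosis = rep(c("No", "Yes"), each = 10),
  Hb  = rnorm(20, mean = 11, sd = 1),
  TSH = rnorm(20, mean = 3,  sd = 1),
  AMH = rnorm(20, mean = 5,  sd = 2)
)

# PCA on the numeric columns, centred and scaled to unit variance
pca_fit <- prcomp(blood_example[, c("Hb", "TSH", "AMH")],
                  center = TRUE, scale. = TRUE)

# Scores on the first two components, kept together with the diagnosis
scores <- data.frame(PCOS_diagnosis = blood_example$PCOS_diagnosis,
                     pca_fit$x[, 1:2])

# Proportion of variance explained by each component
var_explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
```

Plotting `PC1` against `PC2` coloured by `PCOS_diagnosis` shows whether the two groups form separate clusters.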

PCA of body measurements

Discussion


Conclusion

  • no significance